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We present an approximate analytic study of our previously introduced model of evolution includ- 
ing the effects of genetic exchange. This model is motivated by the process of bacterial transforma- 
tion. We solve for the velocity, the rate of increase of fitness, as a function of the fixed population 
size, N. We find the velocity increases with IniV, eventually saturated at an N which depends on 
£SJ ' the strength of the recombination process. The analytical treatment is seen to agree well with direct 

_ ' numerical simulations of our model equations. 

Oh' 

on 



i 



O 

in 



I. INTRODUCTION 



Recombination of genetic information is an important strategy employed by biological systems to foster evolutionary 
novelty and to mitigate the adverse effects of deleterious mutations Q| . It is therefore critical to understand how the 
recombination details interact with factors such as mutation and finite population size so as to determine the overall 
Oh i Darwinian dynamics. Theoretical concepts and models can be directly tested by comparison with laboratory-scale 
experiments on microorganisms [2j , especially those that can switch from asexual to sexual reproduction as a function 
of controllable conditions 0, • 

There has of course been a great deal of work on both the effects of the exchange of genetic information and on the 
evolution of sex 0,0, 00- Yet, analytically tractable approaches which can address the role of finite population size 
(and the resultant linkage of different loci) within a full genomic model are still lacking 9J . For example, most work to 
I | date focuses on systems with just two loci, although this limit is not at all suitable for the majority of microorganism 
systems. Including recombination in the simple landscape models [ToL llll Il2| that have proven useful for asexual 
' evolution 0] is, we feel, the first step in this necessary direction. 

The purpose of this paper is to present a detailed analytical investigation into a previously introduced Q model 
of recombination motivated by the phenomenon of genetic exchange by competent bacteria. Under proper conditions, 
many species of bacteria can import snippets of DNA from the surrounding medium; presumably these are then 
homologously recombined so as to replace the corresponding segments in the genome . Most biologists are convinced 
that this process serves to enhance genetic diversity and thereby allows for better response to environmental challenges 
faced by the colony 0, . Our work addresses the conditions under which this type of genetic exchange is likely to 
2 ' be beneficial. 
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II. EVOLUTIONARY MODEL 

Our model 14] consists of a population of N individuals each of which has a genome of L binary genes. An 
individual's fitness x depends additively on the genome x — 2j=i &i with S = 0, 1. Evolution is implemented as 



a continuous time Markov process in which individuals give birth at the rate x and die at random so as to maintain 
the fixed population size. Every birth allows for the daughter individual to mutate each of its alleles with probability 
fio giving an overall genomic probability of ji = /j.qL. In addition, we add a process which mimics the aforementioned 
method of recombination. At the rate f s L, an individual has one of its genes deleted and instead substitutes in a new 
allele from the surrounding medium; the probability of getting a specific S is just its proportional representation in 
the population. This last assumption should be valid as long as the distribution of recently deceased (and lysed) cells 
is close to that of the current population; this should be the case whenever the random killing due to a finite carrying 
capacity is the most common reason for death. 

The aforementioned Markov process is much too complicated to be solved exactly. In our previous work, we carried 
out a set of simulations to address the effect of finite values of f s . We showed that at very small population sizes, 
recombination has little effect, since there is no population diversity upon which to act. At large N, the rate of 
evolutionary advance is much higher and again roughly independent of the recombination rate; this is because the 
effects of linkage disappear. This would occur even without recombination, albeit at values of N that are unattainably 
large, even in viral experiments [TT| . Most importantly, the population scale for the rise is a strongly decreasing 
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function of / s , and hence recombination at intermediate N can give a dramatic speedup of the evolution. This basic 
result is qualitatively consistent with recent experiments 0, in microorganism evolution which demonstrate an 
increase in the efficacy of recombination as the population size is increased (starting from small). 

In order to approach this system analytically, we wish to derive an effective equation governing the fitness distribu- 
tion of the population as a function of time. In the absence of recombination, it has been convincingly demonstrated 
that this can be accomplished by modifying the naive mean-field theory (aka the Eigen-Schuster quasispecies equa- 
tions 0] ) by adding a cutoff on the birth rate if the population density near the leading edge goes below P, = k/N, 
for some k of O(l). The key to this idea, introduced independently in several different contexts |loL fTgL I20L I21L 12^. l2^ | . 
is that the major effect of finite population size is to prevent the leading edge of most-fit individuals from spreading 
too far and too fast, as in reality there must be at least one individual (out of iV) at a certain fitness for the equation 
to make sense. This notion leads to two different equations for the dynamics of the fitness distribution function, 
depending on the size of P. First, if P is larger than P c , we have 



dP x (t) 
dt 



(x - \)P x (t) + ti 



P x+1 (t) + (x - 1) 1 — P x -i{t) - xP x {t) 



S x (1) 



The first two terms reflect the birth-death process and the genomic mutation respectively; the explicit form of the 
mutation term arises from considering the probability of an individual with fitness x giving birth (rate ~ x), mutating 
(~ /i), and hence going either up (~ (1 — x/L), the number of currently bad alleles) or down (~ x/L, the number of 
good alleles). Alternatively, if P falls below P c we drop the birth-death term (x — X)P X in the above equation. As 
already mentioned, it is only important to impose this cutoff on the leading edge of the population, not at the trailing 
edge. Finally, A is a Lagrange multiplier arranged so as to maintain the total population size at N; 

J dxxP x 0(P x - P c ) 
~ JdxP x 6(P x -P c ) 

For large N, A is essentially just the population mean fitness x and we will use this in what follows. 

The last term, S x , reflects the recombination effect. In our previous work, we verified that at all but smallest values 
of N it was reasonable to assume that the subpopulation at some particular fitness x had the same distribution of '0' 
and T' alleles at each locus. This assumption means that selecting at random an allele from the extracellular medium 
gives the site-independent chance x/L of getting S — 1 and 1 — x/L of getting S — 0. Following the above logic, this 
gives 



/ X \ X + 1 _, , , XI x — 1 \ _ , s 



(2) 



The remainder of this paper is devoted to solving this equation, using the fact that N is large. Of particular interest 
will be the typical scale of f s that is needed to recover the fast evolutionary advance expected in the no-linkage limit. 



III. APPROXIMATE PULSE SOLUTION 



We are interested in propagating solutions in which the fitness distribution function takes the form of a localized 
pulse with a mean fitness and a typical width 0,0]. First, we will replace the spatial finite-difference equation by a 
PDE by taking the continuous space limit. Later, we will return briefly to the question of when this is quantitatively 
valid. For convenience, we will also rescale time by a factor of L, since as we will see the natural scale of velocity in 
the above equation is O(L). This leads to 



(x - x) 4xfi _ /i L 

L L 2 L L 
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In a system with translation invariance, a fixed velocity pulse would be an exact solution of the governing equation 
of the traveling wave form 



P x {t) = p{x - vt) 
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Here, such a form will be approximately valid, to the extent that we can use a quasistatic approximation and treat 
the pulse shape as approximately constant in time. Later, we will discuss why this is valid at small velocity and how 
it breaks down at very large N. In detail, we set x = x[t) + z, v = % and obtain 



= p 



dt 

Z_ 4(x + z)\x _(*,fa 
L L 2 L L 



( 2{x 2 + 2xz + z 2 )fi u(x + z) z 

p [ v + l 2 — + H 

p" (»{x + z) , fs( 22 + z _ 2 {x + z)x\\ (4) 



L L V L 

We will focus on the specific example of the speed at the midpoint of the genomic landscape, x = ^, treating it as a 
constant parameter; extensions to other values will be mentioned below. Ignoring the dependence of the pulse shape 
on the time-dependent x is the technical manifestation of the aforementioned quasistatic approximation. We can drop 
terms of order ^ and If- as compared to j- , since the width of the pulse will turn out to be much greater than unity 
for large N . Finally, we ignore the z dependence in the diffusion constant as this purely mutational piece is smaller 
than the leading order (constant) contribution. This then leads to our basic pulse equation 



0= Pz -+ P ' z (y + F-)+-rt (5) 

where F = f s + /i- The solution to this equation must then be matched to the solution of the simpler equation that 
governs the region past the cutoff where p < P c . 

IV. WKB SOLUTION - LEADING ORDERS 

Our pulse equation is equivalent to a parabolic cylinder equation and so can be solved exactly and subsequently 
matched to the solution past the cutoff 24(]. This procedure would leave us with a complicated special- function 
equation for the velocity, which would have to be solved either numerically or via asymptotic approximations valid at 
large L. We find it more transparent to derive the needed approximations directly from our equation. We first rescale 
our equation, defining y — Fz/L. In terms of y our equation reads 



o = yp(y) + ^ P '(y) + ^p"(y) (6) 



where A = L/F 2 and is assumed large. Given that the highest derivative is multiplied by the small factor 1/A 2 it is 
natural to write down a WKB approximation for p, 

p(y) = Ce s ' A (7) 

To leading-order 

s' = 2(-v- y + V(v + y) 2 -y) (8) 

and so with the convention that S(0) = 0, we obtain 



S = 



-ivy - y 2 + (v + y - ^)V( v + y) 2 -y 

-v(v - i) - (v - ln(l - 4v) 

v-2) ln(-2« - 2y + 1 - 2y/{v + y) 2 - y) 



(9) 



C is a normalization constant that needs to be determined by the integral condition, p(z)dz = 1. The dom- 
inant contribution to the normalization integral arises from the region of the peak where we can use a quadratic 
approximation for S, S w — y 2 j2v. This gives us the very simple result C — ^^Lv ' 
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The velocity is determined by matching to the region to the right (R) of the cutoff point. The equation in this 
region is 

= (v + y) P ' R (y) + ^ K pUy) (10) 

the WKB solution to which is 

p R ~ C R e- A ^y^ (11) 
from which, using the condition p R (y c ) = Pc we obtain 
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-A(4v(y-y c )+2(y 2 -y 2 J) f^) 



Now, since S'(y c ) for the pre-cutoff solution does not equal S'(y c ) for the post-cutoff solution, there is in general no 
way that the two WKB expressions can satisfy the matching condition on the first derivative. The only resolution of 
this problem is if the pre-cutoff WKB solution breaks down before the cutoff, due to the presence of a turning point. 
The turning point, y*, is given by the discriminant condition (v + y*) 2 — = 0, so 



1 - 2v - \/l - 4u 

y* = 2 , 

S'„ = S'(y») = . (13) 

This allows us to calculate the amount of exponential decline from the peak at S' = to the turning point S' t by just 
plugging into the previous expression. 

p(y*) L 



+ {±-v)\n(l-4v) 



(14) 



Since, as we shall see, the turning point is in general close to y c , this already allows us to calculate the leading order 
(geometrical optics) expression for the velocity: 



l-4v 



3v - 1 + Vl - 4v H — ln(l - 4v) 



(15) 



For small v <C 1, this equation reduces to 

lnP c = -J^ 3 ( 16 ) 

so that the velocity scales as F 2 / 3 (— In Pc) 1 / 3 , which is what we obtained previously [lj} m a purely-mutational model 
with very large L. This is consistent with the observation that at x = L/2, mutation and recombination act essentially 
identically, the only difference being the addition fj,y term in the diffusion operator. For small v, is proportional to 
v, and so this extra term does not contribute. The new feature of the calculation not contained in the previous work 
is the presence of a limiting velocity of 1/4. We will discuss the significance of this particular value below. It should 
be noted that the above formula has the velocity achieving 1/4 at a finite, though very small value of P c , instead 
of the value we would expect. Thus according to this formula, the velocity is undefined for P c 's smaller than this 
value. The futher corrections we derive will rectify this anomaly. 

As is usually the case with WKB, we need to treat the region close to the turning point y = y* more carefully if 
we wish to obtain an accurate formula for the solution. Since this is the region wherein one has to be match to the 
solution past the cutoff, this care is indeed necessary. We let the solution p be given as 

p = e HVT=4t-i)(y-y.)) (() (17) 

where we have taken out of p the exponential dependence at the turning point, S^. Substituting this expression into 
the equation for p, it is easy to show that ip obeys the equation 

1 L<f > "(y) + y^tc! ) '(y) + VT^(y-y*)cb = (18) 
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The normal turning point type solution ignores the first derivative term thereby resulting in an Airy equation for 
The solution then is 



where the length scale S is given by 



= CiAi 



S = (2A)~ 2/3 (1 - Av) 



V* 



y 



-1/6 



(19) 



(20) 



With this, it is clear that the first derivative term is irrelevant as long as y/l — AvS is much larger than 0(1/A), or 
equivalently 1 — 4u <C 1/A = F 2 /L. Since L is always taken to be large, we see that the Airy approach will work 
for a range of N that becomes astronomically large. Beyond this point, one must resort to a much more complicated 
solution (involving parabolic cylinder functions; this pedantic exercise will not be presented here). We do mention in 
passing that it is possible to prove that v will not reach 1/4 until N actually reaches infinity. 

Since 6 ~ O (A~ 2 / 3 ), it is clear that the logarithmic derivative ofp is still dominated by AS 1 *. Thus the only way to 
achieve a match of the first derivatives is if to leading order p vanishes. Thus, (y» — y c )/S must be close to £o = —2.338, 
the location of the first zero of the Airy function. Thus gives an additional fall of e~ AS '^° s to the magnitude of p 
over the distance between and y c . This is the next order, 0(A 1 ^ 3 ) contribution to the velocity relation, which now 
reads: 



InP, 



L 

2F 2 



, 1 — Av 

3v - 1 + VI - Av H — Ln(l - Av) 

+ ( i f,) 1/3 (l-VT^)eo(l-4,)- 1 /a 



(21) 



FULL WKB SOLUTION 



So far, we have calculated the first two orders in the velocity for large L. A full WKB approach should, however, 
consider all the terms that do not vanish in the L — > oo limit. To achieve such a solution, we have to obtain the 
physical optics WKB expression, and match to the post-cutoff solution. A full discussion of this procedure applied to 
a related model is given in Ref. El Here we will be brief, outlining the steps without additional commentary. The 
physical optics WKB solution is 



P = C 



~{y + vf -y~ 


-1/4 


1 - 2y - 2v - 2 v /(y + v) 2 - y 


[ V 2 




1 - Av 



1/2 



(22) 



This expression is obtained in the standard manner by just going to next order in the WKB expansion. This then 
has to be matched to a more accurate solution near the critical point. Taking into account the modification due to 
the small first derivative term in Eq. 1|18[) . we can derive 



The matching yields 



p=d [l - 2A(y - y*) 2 ] Ai - 



C X = C 



y-y* 



2A5' 



2J¥t5 1 ^ 



,AS. 



(1 - Av) 3 / S 



(23) 



(24) 



We now have to match this to the post-cutoff solution, which has a logarithmic derivative (w.r.t. y) of — 4A(y c + v) 
to leading order. As noted above, this fixes the location of y c : 



lie 



y* - Zo5 + 2AS 3 - 



1 



A(l - y/1 - Av) 



y* - £o<5 + 



1 



2AV1 - Av A(l - VI - 4w) 



(25) 



where the third term comes from the shift in the argument of the Airy function, and the last term from the small 
distance from the zero of the Airy. The condition that p(y c ) = Pc then gives 



P c = dAi% 



1 

A(l- Vl - 



(26) 



6 



0.3 
0.25 

0.2 
=i 0.15 

0.1 

0.05 


10 



— t 1 








— — _ ~ — — _ 








s 














: 




*\\ \ 




First Order 


\\ \ 




Second Order 


V v 




Third Order 


\ v 1 




• Simulation 


\ x ~ 
\ \ 

\ v _ 




i i 


1 



FIG. 1: Comparison of the analytic approximation for the velocity at x = L/2 as a function of the cutoff P c , Eq. 1271 . ("Third 
Order") with the results of direct simulations of the time-dependent spatially-discrete equation, Eq. Q. In addition, we 
show the lower order approximations, Eqs. 11511 ("First Order") and 1211 1 ("Second Order"). The parameters employed are: 
L = 1000, F = 2.1. (color online) 



Putting this all together gives us our final expression for the velocity: 



lnP r = 



L 
2F 2 " 



, 1 — Av 

3v - 1 + Vl - Av H — ln(l - Av) 



f L \ 1/3 (l- yT=4lT)g 



V 4F 2 J 



+ ln 




(1-4«)V« 
1/3 Ai'(fo) 



1 - VI - 4u 



(l-4u) 1 /6(i_ x /l-4«) 



2y1 - 4w 



(27) 



We compare our current expression for the velocity with numerical simulation data obtained by solving the original 
time-dependent pde and measuring the pulse speed when the mean fitness passes L/2. The results are presented in 
Fig. 1, where we in addition display the lower order approximations, Eqs. (|15|l and J2J. We see that these additional 
contributions are indeed significant, even for the relatively large L employed. 

One interesting fact that emerges from this analysis concerns the extent to which the selected velocity depends 
on the nature of the imposed cutoff. Recall that our cutoff dropped the entire birth-death term in the region past 
P c . Since the first two terms in Eq. 1271 arose simply by requiring that p approach zero at the cutoff, these would be 
unchanged had we employed a different treatment, say dropping only the birth term. This is also true of the 0(\nL) 
piece. Only the 0(1) contribution is modified. In detail, in we were to drop only the birth term, the distance of y c to 



the zero of the Airy function is modified, and is now 



A\/4-2Vl-4u-4« 



so that the velocity-cutoff relation now reads: 



In P. = 



L 
2F2 



, 1 — 4v 

3v - 1 + Vl - 4v H — ln(l - Av) 



f L \ 1/3 (1-VT^)Zq 



\4F 2 J 



In 



(1 - 4u) 1 /6 



( 


"16F" 


1/3 




L 2 





Ai'(6>) 



1 - VI -4w 



1 - Vl - 4w 

(1 - 4u) 1 /6( v /4 - 2V4 - 4^ - Av) J ' ^A - 2^/1^ 4^ - Av 2V1 - Av 



(28) 



In Fig. |2 we plot this modified velocity together with the original. We see that the general features of the two models 
are similar, and that the modification results, for a given v, in an effective decrease in P c by a factor of about 10. 
This could be absorbed in our phenomenological parameter k relating P c to the N in the stochastic model. 

Let us return to the scaling implied by the analytic formula. From the above, it is clear that the large N limit in 
which v asymptotes to 1 /4 is denned by 



N > N r . 



(29) 



This is a very strong function of F; one can reach the large population limit velocity at much smaller size by making 
a relatively modest increase in F. In this model, this behavior is what is responsible to the efficacy of recombination 
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FIG. 2: Velocity vs. the cutoff, P c , when only birth and not death is excluded beyond the cutoff. For comparison, the velocity 
from the original model with both birth and death excluded beyond the cutoff, is presented as well. The results of simulation 
(circles) as well as the analytic WKB prediction (lines) are displayed. Parameters are L = 1000, F = 2. (color online) 

in improving the rate of evolution. Essentially, fitness improvement rates will be limited by clonal interference as N 
becomes moderately large; different beneficial mutations arise in different lineages and cannot be jointly selected for. 
This can only be overcome if the variance of the population is so large that all different combinations of multiple 
mutants are present simultaneously or if recombination succeeds in collecting these different mutations in a common 
line. This reduces the needed width, which is proportional to IniV, by a inverse factor of F 2 . 

Now, the fact that mutation and recombination act similarly (adding up to give F) is a feature of the fact that we 
have chosen to do our calculation at the symmetric point x — L/2. If the mean fitness is above the midpoint, the 
effects of \l and f s diverge. In particular, if we return to the original pulse equation (4), we see that there is a drift 
term proportional to fi that increases with x; this term is merely the fact that in this part of the landscape most 
mutations are detrimental. Crucially, there is no such term proportional to f s arising from the recombination process. 
Thus, one cannot in general get to large positive velocity by increasing the mutation rate; the bias will win out. One 
must resort to a recombination strategy; there will be no recombination load as long as we assume (as we have) that 
genes are entities that cannot be broken apart by recombination. 

Looking more closely at what happens away from the symmetric point x, — L/2, for simplicity we drop the 
mutation term and concentrate solely on the recombination. There is now an addition term Fz(l — 2a) / Lp" (z) = 
(F 4 j L 2 )p" (y)(y(l — 2a) /F) in the steady-state equation. The calculation is similar; note however that now the 
equation is no longer a parabolic cylinder equation and no exact solution is possible; nonetheless our WKB treatment 
still works. Qualitatively the picture is the same. The additional term in the equation is irrelevant to small velocities, 
becoming more important as the velocity increases. The most interesting question is what happens to the limiting 
velocity. The limiting velocity is given in general by the point where dS'^/dv diverges. A straightforward, but messy, 
calculation yields the result 

Voo = a{l - a) (y^y (l - \fi - 2(1 - 2q)/f) (30) 

where a = x/L. Note first that this is the "binomial" answer, a(l — a), times a correction factor which is a function 
of the ratio F/(l — 2a). For large F, this correction factor approaches unity, and we recover the "binomial" answer. 
This is consistent with our observation |23j| that the effect of the recombination is to drive the system to the binomial 
distribution. For a > 1/2, the limiting velocity is less than the binomial result, whereas for a < 1/2 it is always 
greater. For F < 2(1 — 2a), our formula does not predict any limiting velocity. This just means that the velocity is 
controlled by the right boundary at y = 1 — a and not by the cutoff. We have verified our formula numerically for a 
number of different a. 

It is worthwhile considering the origin of this deviation from the "binomial" answer for infinite N, in light of the 
easily verified fact that the binomial distribution with velocity a(l — a) is an exact solution of the fully time-dependent 
equations in the absence of a cutoff, for any value of F. The answer is that we are seeing an effect of the breakdown 
of the quasi-static approximation for large u's, of order 1 in our scaling. Thus, when a < 1/2, when the dynamic 
interface is accelerating, the velocity does not have a chance to reach the quasistatic answer before it has to move 
on, and hence the dynamic velocity is less than the quasi-static prediction. Similarly, for a > 1/2, the interface is 
decelerating, and hence the velocity is greater than the quasistatic prediction. Only when v <C 1 is the quasistatic 
approximation quantitatively valid; this limitation also emerges if one just estimates the terms being dropped in the 
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original pulse equation. We do not know at present of an analytical method which can go beyond this limitation, 

A last issue that is worth mentioning is the use of the space continuum approximation for our evolution system 
originally defined on a lattice. We have shown elsewhere that lattice effects can become important if the effective 
diffusion constant (here equal to F/A) is small. Out interest here is in the effect of taking F fairly large by having 
a significant recombination rate; in this range, the continuum approximation is quantitatively justified. This can of 
course be seen a posteriori by the agreement between the calculations and the numerical velocity data. 

VI. DISCUSSION 

This paper has been devoted to an analytic treatment of a previously developed model for evolution at finite 
population size with both mutation and one specific type of genetic recombination. Our results indicate that the 
rate of evolution can be dramatically speeded up by recombination as we approach the answers that would pertain in 
the infinite population size limit, where all genomes are almost immediately populated and selection alone accounts 
for the (unnormalized) speed of L/A. If mutations were strictly unbiased, similar effects could be had by a greatly 
increased mutation rate. But, as soon as we move up the fitness landscape towards the eventual equilibrium state, 
deleterious mutations are more common and increasing the rate increases the mutational load. In our model, there is 
no rccombinational load. 

In or der to se e the effects of recombination, our basic rate of recombination events per gene, denoted by f s must be 
order i/L/lniV. In comparing to real systems, L is the number of genes contributing to possible fitness changes in a 
novel environment and N is of course the population size. The rate itself is measured with respect to the differential 
birth rate increase due to the fixation of one beneficial allele. Our feeling is that it may be premature to try to compare 
this theory quantitatively to any specific set of experiments, but nonetheless the basic trends should be verifiable in 
microorganism experiments. 

Future work must address a number of points that are absent in our simple model but presumably present in the 
real microbial world. First, is the role of lethal mutations in biasing the genetic composition of the environment; some 
deaths are due to bad genes and these could be picked up by DNA importation. This is a form of recombination load, 
i.e. a bias towards the negative in the recombination process. A second issue along the same lines concerns the fact 
that homologous recombination may occur in the middle of a coding sequence and will bring together incompatible 
fragments. In our model this would appear as an epistatic interactions among the sites. Finally, our analytic method 
assumes we can write down an equation solely in terms of the phenotype distribution P x (t) . This clearly breaks down 
at small N where different alleles have very different population distributions and we do not as yet have a method 
which can reach into this region. 
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